METHODS FOR REMOVAL OF DOUBLE-STRANDED
OLIGONUCLEOTIDES CONTAINING SEQUENCE ERRORS 
USING MISMATCH RECOGNITION PROTEINS 

BACKGROUND OF THE INVENTION 

Field of the Invention

The present invention in certain embodiments is directed toward the
removal of double-stranded oligonucleotides containing sequence errors. It is more
particularly related to the removal of error-containing oligonucleotides (such as
error-containing double-stranded DNA), generated for example by chemical or enzymatic
synthesis (including by PCR amplification), by removal of mismatched duplexes using
mismatch recognition proteins.

Description of the Related Art 

For purposes of this application, DNA is used as a prototypical example of
an oligonucleotide. Mismatches are formed directly during chemical DNA synthesis or are
formed in enzymatically synthesized DNA by denaturing and reannealing a mixed
population of correct and error-containing DNA.

In chemical DNA synthesis, the mismatches originate during the synthesis
of oligonucleotides ("oligos"). These oligos are used as building blocks for DNA synthesis
and are synthesized as single strands using automated oligonucleotide synthesizers.
Random chemical side reactions create base errors in these single-stranded oligos. When
two complementary synthetic oligos are hybridized to form double-stranded DNA, there is
almost no chance that the random base errors formed in one strand will be correctly base
paired in the opposite strand. It is these incorrectly paired bases that form the mismatches
found in chemically synthesized double-stranded DNA.

In enzymatic DNA synthesis, an enzyme (such as a polymerase) is used to
amplify or assemble from a synthetic DNA template. This template contains the same type
of base mismatches that are found in the synthetic DNA described above. However, once
this DNA is amplified, the mismatches are converted into base paired errors in sequence.
These base pairings of the mismatches occur as the polymerase synthesizes the
complementary base on the opposing strand. The result of this enzymatic step is a mixed
population of DNA molecules in which all bases are paired correctly, comprising both
correct (error-free) and incorrect (error-containing) sequences. The polymerase step
essentially maintains the ratio of correct to incorrect sequence.

A DNA population such as that formed from enzymatic DNA synthesis,
containing both error-free and error-containing DNA in which all bases are correctly
paired, can be converted to a population composed of both mismatched DNA and
error-free, correctly base paired DNA by denaturation and reannealing. When these steps
are performed on a population that contains a small fraction of error-containing molecules
relative to correct molecules, the vast majority of error-containing strands will hybridize
with the more abundant correct strand and will form mismatched sites.

Moreover, even if the errors represent a high fraction of the population (e.g.,
50%), denaturation and reannealing of a DNA population to itself will result in the vast
majority of a particular error-containing strand hybridizing either to a correct strand or to a
strand that contains a distinct error. Thus, a population of DNA will be converted into two
populations: mismatched duplexes containing the errors, and base paired correct DNA.
The correct strands will find correct complementary strands and form perfectly base
paired duplexes.
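
By way of illustration only, the outcome of denaturation and reannealing may be sketched numerically. The Python sketch below rests on a simplified model that is an assumption and not part of the disclosure: each strand pairs with a complementary strand drawn at random, and no two error-containing strands carry the same error, so any duplex that includes an error-containing strand is mismatched. The function name reannealing_outcomes is hypothetical.

```python
def reannealing_outcomes(error_fraction: float) -> dict:
    """Expected duplex fractions after random reannealing.

    Simplifying assumptions (illustrative only):
      * each strand pairs with a complementary strand drawn at random;
      * every error is distinct, so an error-containing strand always
        ends up in a mismatched heteroduplex.
    """
    if not 0.0 <= error_fraction <= 1.0:
        raise ValueError("error_fraction must lie between 0 and 1")
    correct = 1.0 - error_fraction
    homoduplex = correct * correct  # correct strand paired with correct strand
    return {
        "error_free_homoduplex": homoduplex,
        "mismatched_heteroduplex": 1.0 - homoduplex,
        # chance that a given error-containing strand finds a correct partner
        "error_strand_with_correct_partner": correct,
    }


if __name__ == "__main__":
    for fraction in (0.05, 0.5):
        print(fraction, reannealing_outcomes(fraction))
```

Under these assumptions, every error-containing strand becomes detectable as a mismatch after reannealing, whether its partner is a correct strand or a strand bearing a different error.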

Due to the difficulties in the current approaches to the preparation or
amplification of oligonucleotides, such as genes, there is a need in the art for methods for
improving the removal of double-stranded oligonucleotides containing sequence errors.
The present invention fills this need, and further provides other related advantages.

BRIEF SUMMARY OF THE INVENTION 

Briefly stated, in certain embodiments the present invention provides a
variety of methods for removing double-stranded oligonucleotide (e.g., DNA) molecules
containing one or more sequence errors generated during nucleic acid synthesis, from a
population of correct oligonucleotide duplexes. In one embodiment, the oligonucleotides
are generated enzymatically. Heteroduplex oligonucleotides may be created by denaturing
and reannealing the population of duplexes. The reannealed oligonucleotide duplexes are
contacted with a mismatch recognition protein that interacts with the duplexes containing a
base pair mismatch. The oligonucleotide heteroduplexes that have interacted with the
protein are separated from homoduplexes as the latter do not interact with the protein.
These methods are also used to remove heteroduplex oligonucleotides (e.g., DNA) that are
formed directly from chemical nucleic acid synthesis.

In one embodiment, the present invention provides a method of depleting in
a sample of double-stranded oligonucleotides a population of double-stranded
oligonucleotides containing mismatched bases thereby enriching in said sample a
population of double-stranded oligonucleotides containing correctly matched bases,
comprising the steps of: (a) contacting said sample containing double-stranded
oligonucleotides with a mismatch recognition protein under conditions to permit the
protein to interact with a double-stranded oligonucleotide containing at least one
mismatched base; and (b) collecting double-stranded oligonucleotides that have not
interacted with said mismatch recognition protein, thereby depleting the population of
double-stranded oligonucleotides containing mismatched bases. In another embodiment,
there is, prior to the step of collecting, an additional step comprising separating said
double-stranded oligonucleotide containing at least one mismatched base that has
interacted with said mismatch recognition protein, from double-stranded oligonucleotides
that have not interacted with said mismatch recognition protein.

These and other aspects of the present invention will become evident upon
reference to the following detailed description. In addition, various references are set forth
herein. Each of these references is incorporated herein by reference in its entirety as if
each was individually noted for incorporation.

DETAILED DESCRIPTION OF THE INVENTION 

Prior to setting forth the invention, it may be helpful to an understanding
thereof to set forth definitions of certain terms to be used hereinafter.

Natural bases of DNA - adenine (A), guanine (G), cytosine (C) and 
thymine (T). In RNA, thymine is replaced by uracil (U).

Synthetic double-stranded oligonucleotides - two strands of
oligonucleotides (e.g., substantially double-stranded DNA) composed of single strands of
oligonucleotides synthetically produced (e.g., by chemical synthesis or by the ligation of
synthetic double-stranded oligonucleotides to other synthetic double-stranded
oligonucleotides to form larger synthetic double-stranded oligonucleotides) and joined
together in the form of a duplex.

Synthetic failures - undesired products of oligonucleotide synthesis, such as
side products, truncated products or products from incorrect ligation.

Side products - chemical byproducts of oligonucleotide synthesis. 

Truncated products - all possible oligonucleotides shorter than the desired
length, e.g., resulting from inefficient monomer coupling during synthesis of
oligonucleotides.

TE - an aqueous solution of 10 mM Tris and 1 mM EDTA, at a pH of 8.0.

Homoduplex oligonucleotides - double-stranded oligonucleotides wherein
the bases are fully matched; e.g., for DNA, each A is paired with a T, and each C is paired
with a G.

Heteroduplex oligonucleotides - double-stranded oligonucleotides wherein
the bases are mispaired, i.e., there are one or more mismatched bases; e.g., for DNA, an A
is paired with a C, G or A, or a C is paired with a C, T or A, etc.

Mismatch recognition protein - a protein that recognizes heteroduplex
oligonucleotides (e.g., heteroduplex DNA); typically the protein is a mismatch repair
enzyme or other oligonucleotide binding protein (e.g., DNA mismatch repair enzyme or
other DNA binding protein); the protein may be isolated or prepared synthetically (e.g.,
chemically or enzymatically), and may be a derivative, variant or analog, including a
functionally equivalent molecule which is partially or completely devoid of amino acids.

The present invention is directed in certain embodiments toward methods
for the removal of error-containing double-stranded oligonucleotide (e.g., DNA) molecules
from a population of double-stranded oligonucleotides (e.g., that are produced by chemical
or enzymatic synthesis). The error-containing oligonucleotide molecules in this population
are removed from the correct molecules when the errors are present as mismatches in the
double-stranded oligonucleotides. The removal of the mismatch is based in the present
invention on the use of mismatch recognition proteins that recognize mismatched bases in
double-stranded oligonucleotides. Such proteins interact with double-stranded
oligonucleotides containing mismatched bases (e.g., by binding and/or cleaving on or near
the mismatch site). The protein step may or may not be performed in conjunction with a
separation step (e.g., chromatographic step) to separate mismatch-containing heteroduplex
from homoduplex oligonucleotides. It is to be understood that the methods of the invention
have the capability of mismatch removal regardless of the way the mismatch was created in
the population.

More specifically, the disclosure of the present invention shows surprisingly
that mismatch recognition proteins may be used to deplete an oligonucleotide population of
those double-stranded oligonucleotides which contain sequence errors. Depletion of
error-containing oligonucleotides from the desired double-stranded oligonucleotides refers
generally to at least about (wherein "about" is within 10%) a two-fold depletion relative to
the total population prior to separation. Typically, the depletion will be a change of about
two-fold to three-fold from the original state. The particular fold depletion may be the
result of a single use of the method (e.g., single separation) or the cumulative result of a
plurality of uses (e.g., two or more separations). Depletion of error-containing
oligonucleotides is useful, for example, where the oligonucleotides are double-stranded
DNA which correspond to a gene or fragments of a gene.
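
As a purely illustrative aid, the cumulative effect of repeated separations may be sketched as follows. The sketch assumes, beyond what is stated above, that each separation reduces the amount of error-containing duplex by the same fold, that successive separations act independently, and that correct duplexes are recovered without loss. The function name remaining_error_fraction is hypothetical.

```python
def remaining_error_fraction(initial_error_fraction: float,
                             fold_per_round: float,
                             rounds: int) -> float:
    """Fraction of error-containing duplexes left after repeated separations.

    Assumptions (illustrative only): every round depletes the
    error-containing duplexes by the same fold, rounds are independent,
    and no correct duplex is lost.
    """
    if not 0.0 <= initial_error_fraction <= 1.0:
        raise ValueError("initial_error_fraction must lie between 0 and 1")
    if fold_per_round <= 0 or rounds < 0:
        raise ValueError("fold_per_round must be positive and rounds non-negative")
    errors = initial_error_fraction / (fold_per_round ** rounds)
    correct = 1.0 - initial_error_fraction
    return errors / (errors + correct)


if __name__ == "__main__":
    # A population that starts with 50% error-containing duplexes,
    # depleted two-fold per separation.
    for n in range(4):
        print(n, round(remaining_error_fraction(0.5, 2.0, n), 4))
```

Under these assumptions, two successive two-fold separations give a four-fold cumulative depletion of the error-containing duplexes.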

Oligonucleotides suitable for use in the present invention are any
double-stranded sequence. Examples of such oligonucleotides include double-stranded
DNA, double-stranded RNA, DNA/RNA hybrids, and functional equivalents containing
one or more non-natural bases. Preferred oligonucleotides are double-stranded DNA.
Double-stranded DNA includes full length genes and fragments of full length genes. For
example, the DNA fragments may be portions of a gene that when joined form a larger
portion of the gene or the entire gene.

As noted above, the present invention provides a preparative method to 
remove base mismatched oligonucleotides from a population of correctly base matched 
oligonucleotides. The method generally comprises the steps of contacting a double- 
stranded oligonucleotide sample with a mismatch recognition protein, and collecting the 
double-stranded oligonucleotides that have not interacted with the mismatch recognition
protein. Collecting the double-stranded oligonucleotides that have not interacted with the 
protein can be the result of their removal from the sample, or the removal from the sample 
of those oligonucleotides that did interact. The step of contacting is performed under 
conditions (including a time sufficient) to permit a mismatch recognition protein to interact 
with (e.g., bind to and/or cleave) mismatch-containing heteroduplex oligonucleotides. The
method may, prior to the step of collecting, optionally include a step of separating the 
double-stranded oligonucleotide that contains at least one (one or more) mismatched base 
and that has interacted with the mismatch recognition protein, from double-stranded 
oligonucleotides that have not interacted with the mismatch recognition protein. The 
method may, in place of or in addition to a separation step and prior to the step of
contacting, optionally include steps of first denaturing and then reannealing a sample of 
double-stranded oligonucleotides under conditions to permit conversion of the double- 
stranded oligonucleotides first to single-stranded oligonucleotides and then to double- 
stranded oligonucleotides. It will be evident to one of ordinary skill in the art that the steps 
may be performed sequentially, or two or more steps may be performed simultaneously.
For example, in an embodiment where a mismatch recognition protein is immobilized on a 
solid support, the step of contacting results directly in separation. 

In one embodiment the mismatch recognition proteins share the property of
binding on or within the vicinity of a mismatch. Such a protein reagent includes proteins
that are endonucleases, restriction enzymes, ribonucleases, mismatch repair enzymes,
resolvases, helicases, ligases and antibodies specific for mismatches. Variants of these
proteins can be produced, for example, by site directed mutagenesis, provided that they are
functionally equivalent for mismatch recognition. The enzyme can be selected, for
example, from T4 endonuclease 7, T7 endonuclease 1, S1, mung bean endonuclease,
MutY, MutS, MutH, MutL, cleavase, and HINF1. In another embodiment of the invention,
the mismatch recognition protein cleaves at least one strand of the mismatched DNA in the
vicinity of the mismatch site.

The optional separation step can be performed by a variety of means, e.g.,
using high performance liquid chromatography (HPLC), by size exclusion
chromatography, ion exchange chromatography, affinity chromatography or reverse phase
chromatography. The separation can also be performed using membranes in a slot blot
fashion or a microtiter filter plate. The separation may also be performed using solid phase
extraction cartridges using supports similar to the HPLC columns.

In one embodiment, a mismatch recognition protein (e.g., the MutS protein
from E. coli) is immobilized on a solid support. Methods for immobilizing proteins on
solid supports are well known to one in the art, and include covalent or noncovalent
attachment to a solid support. Similarly, types of suitable solid supports are well known to
one in the art, and include beads, glass, polymers, resins and gels. The following is a
representative example for preparing oligonucleotides depleted of error-containing
oligonucleotides. Two complementary oligonucleotides (e.g., DNA) are chemically
synthesized and then hybridized to form duplex oligonucleotides (e.g., double-stranded
DNA). Alternatively, double-stranded DNA may be enzymatically synthesized (and
further denatured and reannealed). This mixture is passed over a column with a mismatch
recognition protein (e.g., the MutS protein) immobilized on a solid support (such as beads)
in the column. Fragments with an error in either of the oligonucleotides will usually
contain a mismatch since in most cases the other strand is correct at that position.
Duplexes containing mismatches will bind to the column and only error-free duplexes will
be enriched in the flow-through from the column.

In another embodiment, a gene encoding a mismatch recognition protein
(e.g., the MutS gene) is fused to a gene fragment that encodes a binding domain (for
instance a chitin-binding domain). The following is another representative example for
preparing oligonucleotides depleted of error-containing oligonucleotides. The fused
protein is produced and mixed with a duplex fragment that is produced as described above.
Duplex molecules with an error in either strand will bind to the fusion protein (e.g., MutS
fusion protein). After an appropriate incubation, the mixture is passed over a chitin
column. The fusion protein binds to the column via the chitin. Duplex molecules with
mismatches are retained on the column, and error-free duplexes flow through.

The following examples are offered by way of illustration and not by way of
limitation.

EXAMPLES

EXAMPLE 1 
Synthesis of a 205 bp DNA Fragment From the 
Operator-Binding Region of the lacI Gene 

Beta-galactosidase is an enzyme that can convert X-gal from a colorless
compound into a brilliant blue compound (Maniatis; Sambrook et al., Molecular Cloning:
A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989).
The lacI gene encodes a repressor of beta-galactosidase synthesis in E. coli. In a cell with
functional lac repressor, the synthesis of beta-galactosidase is suppressed and colonies
grown on X-gal plates are white. If the lac repressor gene is inactive, beta-galactosidase is
produced and the colonies are a bright blue color. Because the function of the lac repressor
can be measured with simple, in vivo assays it has been the subject of extensive genetic
analysis (Markiewicz et al., J. Mol. Biol. 240:421-33, 1994; Suckow et al., J. Mol. Biol.
261:421-33, 1996). Based on this work, four G residues in a 205 base pair fragment which
cannot be changed without inactivating the protein were chosen. The sequence at these
residues can thus be determined by assaying for Lac repressor function.

A 205 base pair segment of the lacI gene with the sequence:

  1 AATTCATAAA GGAGATATCA TATGAAACCG GTAACGTTAT ACGACGTCGC TGAATACGCC
 61 GGCGTTTCTT ACCAGACCGT TTCTAGAGTG GTTAACCAGG CTTCACATGT TAGCGCTAAA
121 ACCCGGGAAA AAGTTGAAGC TGCCATGGCT GAGCTCAACT ACATCCCGAA CCGTGTTGCG
181 CAGCAGCTGG CTGGTAAACA AAGCT

is synthesized using a set of overlapping double-stranded oligonucleotides. 

The oligonucleotides used to make the gene are prepared using an Oligo
1000M DNA Synthesizer (Beckman Coulter, Inc., Fullerton, CA) using Beckman 30 nM
DNA Synthesis Columns. All standard phosphoramidites and ancillary synthesis reagents
are obtained from Glen Research, Inc. (Sterling, VA). Chemical phosphorylation of the
oligonucleotides is done with the Chemical Phosphorylation II (Glen Research).
Concentrated ammonia is obtained from Fisher Scientific (Springfield, NJ). 40%
N-methylamine is obtained from Fluka Chemical Corporation (Milwaukee, WI). After
cleavage from the solid support, the oligonucleotides are Trityl On purified using Poly-Pak
Cartridges according to the instruction manual provided by Glen Research. Reagents for
Trityl On purification are HPLC-grade acetonitrile and water obtained from Burdick &
Jackson (Muskegon, MI). Triethylammonium acetate (TEAA), pH 7.0, and 3%
Trifluoroacetic acid in water are obtained from Glen Research. After purification, the
synthesized oligonucleotides are evaporated to dryness in a SpeedVac (Savant,
Farmingdale, NY) and resuspended in HPLC grade water. Concentrations of the
oligonucleotides are determined by reading the 260 nm absorbance on a Pharmacia LKB
Ultrospec III (Amersham Pharmacia, Uppsala, Sweden).

The oligonucleotides are used to form duplex fragments by drying
500 pmoles each of the complementary oligonucleotides in a SpeedVac and resuspending in
10 microliters TE. A 5 microliter sample of the solution (250 pmoles) is mixed with
10 microliters of 2X SSPE (prepared according to Maniatis), heated to 95°C and cooled to
room temperature.

Duplexes are successively ligated together to make longer fragments until
the full length product is made. Each ligation consists of 500 picomoles of a pair of
double-stranded oligonucleotides, 3 microliters of 10X ligation buffer (Fermentas Inc.,
Hanover, MD), 10 units of T4 DNA ligase (product # EL0016, Fermentas) and water to
make a total volume of 30 microliters. All duplexes are ligated together under the same
conditions. Each ligation mix is incubated at 37°C for 60 minutes, heated to 65°C for
10 minutes and the fragment isolated by HPLC.

High performance liquid chromatography (HPLC) is performed on a ProStar
Helix HPLC system from Varian Inc. (Walnut Creek, CA) consisting of two high-precision
high-pressure pumps (ProStar 215 Solvent Delivery Modules), a column oven (ProStar 510
Air Oven), a UV detector (ProStar 320 UV/Vis Detector) and a fraction collector
(Dynamax FC-1 Fraction Collector), all controlled by Star Chromatography Workstation
Software (Version 5.31). The column used is a Zorbax Eclipse dsDNA Analysis Column
(4.6 mm ID x 75 mm, 3.5 micron) equipped with an in line Guard Column (4.6 mm ID x
12.5 mm, 3.5 micron) both from Agilent Technologies, Inc. (Palo Alto, CA). The
following pre-made buffers are obtained from Varian Inc. (Walnut Creek, CA): Helix
BufferPak "A" (100 mM Triethylammonium acetate, pH 7.0, 0.1 mM EDTA) and Helix
BufferPak "B" (100 mM Triethylammonium acetate, pH 7.0, 0.1 mM EDTA with 25% by
volume acetonitrile). The thermal and gradient conditions for isolating chemically-pure
enriched sequence are calculated using the DHPLC Melt Program
(http://insertion.stanford.edu/melt.html) available from Stanford University (Palo Alto,
CA). Elution profiles are monitored using a UV detector with absorbance at 260 nm.

The ligated fragments are dried down from the HPLC buffer and
resuspended in TE. These fragments are used in a second set of ligation reactions. Several
rounds of ligation followed by purification and fragment isolation are used to build the
205 base pair fragment of the lacI gene.

EXAMPLE 2

Functional Testing of the 205 Base Pair Fragment of the lacI Gene 

The synthetic fragment produced in Example 1 is cloned into the lacI gene
to test its function. Three micrograms of plasmid vector pWB1000 (Lehming et al., Proc.
Natl. Acad. Sci. USA, 85:7947-7951, 1988) is digested with restriction enzymes EcoRI
and HindIII and the vector fragment gel purified using a Strata Prep DNA extraction kit
(Stratagene product #400766) according to the manufacturer's instructions, and resuspended
in 100 microliters of TE. One microgram of the lacI fragment is treated with T4
polynucleotide kinase, extracted once with phenol and once with chloroform, ethanol
precipitated and resuspended in 20 microliters of TE. Five microliters of the cut vector and
one microliter of the synthetic lacI fragment are ligated in a total volume of 100 microliters
using Fermentas T4 DNA ligase according to the manufacturer's instructions. The ligation
mix is extracted once with Strataclean, concentrated and washed twice with 1/10th
concentration TE and brought to a volume of 10 microliters in 1/10th concentration TE.
One microliter of this mix is transferred into E. coli strain DC 41-2 carrying plasmid
pWB310 (Lehming et al., EMBO 6:3145-3153, 1987) by electroporation using a BTX
ECM399 electroporator (Genetronics, Inc., San Diego, CA) according to the manufacturer's
instructions. Colonies are grown overnight on LB plates in the presence of 10 mg/liter
tetracycline, 200 mg/liter ampicillin, 60 mg/liter X-gal and 300 mg/liter IPTG. Colonies
carrying a plasmid with a functional lacI gene are white; those without a functional lacI
gene are blue.

EXAMPLE 3 

Preparation of 205 bp DNA Fragments Containing 
Diaminopurine at Bases 86, 88, 133, or 178 

One common side reaction of oligonucleotide synthesis is the formation of
diaminopurine from a dG residue in the DNA chain. Modified oligonucleotides containing
2,6-diaminopurine are obtained from Trilink Biotechnologies (San Diego, CA) and
incorporated into the 205 bp lacI gene fragment. Four samples are prepared as described in
Example 1, with one diaminopurine residue (labeled D below) substituted for a dG residue
in each sample.

Oligonucleotide                        Fragment Name   Base Replaced
5' ACCGTTTCTADAGTGGTTAACCAGG 3'        D-T86           86
5' ACCGTTTCTAGADTGGTTAACCAGG 3'        D-T88           88
5' GGAAAAADTTGAAGCTGCCATGGCT 3'        D-T133          133
5' TTDCGCAGCAGCTGGCTGGTAAACAA 3'       D-T178          178
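
The correspondence between each modified oligonucleotide and the numbered position in the 205 base pair sequence of Example 1 can be checked with a short script. The following Python sketch is offered only as an illustration; the function name locate_substitution is hypothetical, and it assumes that each oligonucleotide matches the top strand exactly once after the D is restored to G.

```python
# Native 205 bp top-strand sequence from Example 1 (whitespace is stripped).
NATIVE = "".join("""
AATTCATAAA GGAGATATCA TATGAAACCG GTAACGTTAT ACGACGTCGC TGAATACGCC
GGCGTTTCTT ACCAGACCGT TTCTAGAGTG GTTAACCAGG CTTCACATGT TAGCGCTAAA
ACCCGGGAAA AAGTTGAAGC TGCCATGGCT GAGCTCAACT ACATCCCGAA CCGTGTTGCG
CAGCAGCTGG CTGGTAAACA AAGCT
""".split())

# Top-strand oligonucleotides of Example 3; D marks 2,6-diaminopurine.
D_OLIGOS = {
    "D-T86": "ACCGTTTCTADAGTGGTTAACCAGG",
    "D-T88": "ACCGTTTCTAGADTGGTTAACCAGG",
    "D-T133": "GGAAAAADTTGAAGCTGCCATGGCT",
    "D-T178": "TTDCGCAGCAGCTGGCTGGTAAACAA",
}


def locate_substitution(oligo: str, native: str = NATIVE,
                        placeholder: str = "D", original: str = "G") -> int:
    """Return the 1-based position in `native` of the substituted base."""
    restored = oligo.replace(placeholder, original)
    start = native.find(restored)  # 0-based start of the oligo in the fragment
    if start < 0:
        raise ValueError("oligonucleotide does not match the native sequence")
    return start + oligo.index(placeholder) + 1


if __name__ == "__main__":
    print(len(NATIVE))
    for name, oligo in D_OLIGOS.items():
        print(name, locate_substitution(oligo))
```

The check relies only on the sequences given in this Example and in Example 1.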



EXAMPLE 4 

Preparation of 205 bp DNA Fragments Containing a dU at Positions 86 or 133 

A second common side reaction of oligonucleotide synthesis is deamination
of the N4-amine of deoxycytidine to form a uracil (dU) in the DNA chain. Modified
oligonucleotides containing uracil (dU) are obtained from Midland Certified Reagent
Company (Midland, TX) and incorporated into the 205 bp lacI gene fragment. Two
samples are prepared as described in Example 1, with one uracil residue (labeled dU
below) substituted for a dC residue in each sample.

Oligonucleotide                        Fragment Name   Base Replaced
5' TGAAGCCTGGTTAACCACTdUTAGAA 3'       U-B86           86
5' AGCTCAGCCATGGCAGCTTCAAdUTT 3'       U-B133          133
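
Because these oligonucleotides belong to the bottom strand, the numbered position refers to the top-strand base that pairs with the dU. The following Python sketch, given only as an illustration, restores the dU to dC, takes the reverse complement and locates it in the 205 base pair sequence of Example 1. The function names are hypothetical, and the sketch assumes a single exact match on the top strand.

```python
# Native 205 bp top-strand sequence from Example 1 (whitespace is stripped).
NATIVE = "".join("""
AATTCATAAA GGAGATATCA TATGAAACCG GTAACGTTAT ACGACGTCGC TGAATACGCC
GGCGTTTCTT ACCAGACCGT TTCTAGAGTG GTTAACCAGG CTTCACATGT TAGCGCTAAA
ACCCGGGAAA AAGTTGAAGC TGCCATGGCT GAGCTCAACT ACATCCCGAA CCGTGTTGCG
CAGCAGCTGG CTGGTAAACA AAGCT
""".split())

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unmodified DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_top_position(bottom_oligo: str, native: str = NATIVE,
                        token: str = "dU", original: str = "C") -> int:
    """1-based top-strand position that pairs with the modified base."""
    offset = bottom_oligo.index(token)          # index of the modified base
    restored = bottom_oligo.replace(token, original)
    top = reverse_complement(restored)
    start = native.find(top)
    if start < 0:
        raise ValueError("oligonucleotide does not match the native sequence")
    # Base i of the bottom oligo pairs with base (len - 1 - i) of its
    # reverse complement on the top strand.
    return start + (len(restored) - 1 - offset) + 1


if __name__ == "__main__":
    print(paired_top_position("TGAAGCCTGGTTAACCACTdUTAGAA"))
    print(paired_top_position("AGCTCAGCCATGGCAGCTTCAAdUTT"))
```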

EXAMPLE 5

Preparation of 205 bp DNA Fragments Containing an 
Abasic Site at Positions 134 or 182 

A third common side reaction of oligonucleotide synthesis is the formation 
of abasic sites by depurination of protected adenosine residues during chain elongation. 
Modified oligonucleotides containing uracil are obtained from Midland Certified Reagent 
Company (Midland, TX) and incorporated into the 205 bp lacI gene fragment. Two
samples are prepared as described in Example 1, with one uracil residue (labeled dU 
below) substituted for a dA residue in each sample. 

Oligonucleotide                        Fragment Name   Base Replaced
5' AGCTCAGCCATGGCAGCTTCAdUCTT 3'       A-B134          134
5' TTGCGCdUGCAGCTGGCTGGTAAACAA 3'      A-T182          182

After synthesis and HPLC purification of the 205 base pair fragments, the
DNA is treated with Uracil-N-Glycosylase (Epicentre Technologies Corp., Madison, WI)
according to the manufacturer's instructions to remove the uracil base, leaving an apurinic
site in place of the corresponding A residue in the native 205 base pair fragment.

EXAMPLE 6

Calculation of Thermal and Gradient HPLC Conditions for lacI Sequence 

The thermal and gradient conditions for isolating chemically-pure enriched
sequence are calculated using the DHPLC Melt Program
(http://insertion.stanford.edu/melt.html) available from Stanford University (Palo Alto,
CA) and available for license from the Stanford University Office of Technology Licensing
referring to the docket number S95-024. The 4 base single-stranded region on either end of
the 205 base pair fragment is removed to give the following 197 base pair sequence.

lacI Region

CATAAAGGAGATATCATATGAAACCGGTAACGTTATACGACGTCGCTGAA
TACGCCGGCGTTTCTTACCAGACCGTTTCTAGAGTGGTTAACCAGGCTTC
ACATGTTAGCGCTAAAACCCGGGAAAAAGTTGAAGCTGCCATGGCTGAGC
TCAACTACATCCCGAACCGTGTTGCGCAGCAGCTGGCTGGTAAACAA

The gradients are specified below as percent buffer B at times 1, 2 and 3
(B1, B2, B3). The gradient is run from B1 to B2 in 0.5 minutes, then B2 to B3 in 3.0
minutes.



Conditions   Temperature (C)   B1     B2     B3
1            53                50     59.6   65
2            55                50     56.8   62.2
3            57                50     54.1   59.5
4            59                50     51.4   56.8
5            61                45     50     55.4



Buffer A and buffer B are as described in Example 1.
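
For illustration only, the gradient program may be expressed as percent buffer B as a function of time. The Python sketch below assumes that each segment is a linear ramp and that the composition is held at B3 after the second segment; neither assumption is stated above. It also converts percent B to percent acetonitrile using the 25% by volume acetonitrile content of buffer B given in Example 1. The names percent_b and acetonitrile_percent are hypothetical.

```python
# Condition number -> (temperature in C, B1, B2, B3), from the table above.
CONDITIONS = {
    1: (53, 50.0, 59.6, 65.0),
    2: (55, 50.0, 56.8, 62.2),
    3: (57, 50.0, 54.1, 59.5),
    4: (59, 50.0, 51.4, 56.8),
    5: (61, 45.0, 50.0, 55.4),
}
RAMP1_MIN = 0.5  # B1 -> B2
RAMP2_MIN = 3.0  # B2 -> B3


def percent_b(condition: int, t_min: float) -> float:
    """Percent buffer B at time t_min, assuming linear ramps."""
    _, b1, b2, b3 = CONDITIONS[condition]
    if t_min <= 0:
        return b1
    if t_min <= RAMP1_MIN:
        return b1 + (b2 - b1) * t_min / RAMP1_MIN
    if t_min <= RAMP1_MIN + RAMP2_MIN:
        return b2 + (b3 - b2) * (t_min - RAMP1_MIN) / RAMP2_MIN
    return b3  # assumed hold at the final composition


def acetonitrile_percent(pct_b: float) -> float:
    """Percent acetonitrile by volume; buffer B is 25% acetonitrile, buffer A has none."""
    return 0.25 * pct_b


if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0, 3.5):
        b = percent_b(1, t)
        print(t, round(b, 2), round(acetonitrile_percent(b), 2))
```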

EXAMPLE 7

Determination of the Temperature-Dependent Chromatographic Profiles of 
the Native and Eight Modified lacI Fragments 

The chromatographic behavior of the native lacI DNA and the eight
modified lacI DNA fragments is measured in response to a range of gradient and
temperature conditions. The lacI DNA fragments are listed below:



Name      Type and Location of Modification
Pure      No chemical modification
D-T86     2,6-diaminopurine @ position 86
D-T88     2,6-diaminopurine @ position 88
D-T133    2,6-diaminopurine @ position 133
D-T178    2,6-diaminopurine @ position 178
U-B86     2'-deoxyuridine @ position 79
U-B133    2'-deoxyuridine @ position 133
A-B134    abasic @ position 134
A-T182    abasic @ position 182



25 pmoles of each sample is suspended in 5 µl of HPLC-grade water and
directly chromatographed on a Zorbax Eclipse dsDNA Analysis Column (4.6 mm ID x 75
mm, 3.5 micron) with an in line Pre-Column (4.6 mm ID x 12.5 mm, 3.5 micron) with
Buffer A consisting of 100 mM Triethylammonium acetate, pH 7.0, 0.1 mM EDTA and
Buffer B consisting of 100 mM Triethylammonium acetate, pH 7.0, 0.1 mM EDTA with
25% by volume acetonitrile. The details of each gradient and temperature condition are as
described in Example 6.

Each fragment denatures at a temperature that is a function of the strength of
the duplex structure. The fully base paired native lacI sequence forms the most stable
duplex and denatures under the most stringent conditions. Fragments with base
modifications form less stable duplexes, denature at a lower temperature and thus show
earlier elution at a given temperature and gradient profile.

EXAMPLE 8

Enzymatic Amplification of a Chemically-synthesized 205 Base Pair lacI 
Gene Carrying Modified Bases 

The fragments produced in Example 3, Example 4 and Example 5
(fragments D-T86, D-T88, D-T133, D-T178, U-B86, U-B133, A-B134, A-T182) are
amplified using PCR to convert the base pair mismatches in the synthetic fragments into
base paired errors in the enzymatically produced fragments. The PCR is designed to
amplify the complete fragment and, by means of tails on the primers, to add cloning
sites for EcoRI and HindIII restriction enzymes.

Tailed PCR Primer Sequences

Forward primer: 5'-AGGCTGAAACTGGACAATTCATAAAGGAGATATCATATGAAACCG-3'

Reverse primer: 5'-CTTCGGAAGATCCTTAGCTTTGTTTACCAGCCAGCTG-3'

The PCR conditions for amplification of the fragments produced in Example
3, Example 4 and Example 5 are described in the table below. All the components are
combined and vortexed to ensure good mixing, and centrifuged. Aliquots are then
distributed into PCR tubes as shown in the following table:

COMPONENT                                                             VOLUME
Pfu 10X Buffer (Cat. No. 600153-82, Stratagene, Inc., La Jolla, CA)   5 µL
10 mM dNTP Mix                                                        1 µL
Forward primer (10 pmol/µL)                                           1 µL
Reverse primer (10 pmol/µL)                                           1 µL
H2O                                                                   36 µL
Synthetic DNA Fragment                                                5 µL
PfuTurbo™ (Cat. No. 600250, Stratagene, Inc., La Jolla, CA)           1 µL
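
As an illustrative check of the reaction set-up, the total volume and the final concentrations implied by the table can be computed. The Python sketch below takes the volumes from the table, reading the dNTP Mix volume as 1 µL (an assumption, since its unit is not legible in the source). The function names are hypothetical.

```python
# Per-reaction volumes in microliters, taken from the component table.
REACTION_UL = {
    "Pfu 10X Buffer": 5.0,
    "10 mM dNTP Mix": 1.0,      # unit assumed to be microliters
    "Forward primer": 1.0,
    "Reverse primer": 1.0,
    "H2O": 36.0,
    "Synthetic DNA Fragment": 5.0,
    "PfuTurbo": 1.0,
}


def total_volume_ul(components: dict = REACTION_UL) -> float:
    """Total reaction volume in microliters."""
    return sum(components.values())


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * volume_ul / total_ul


if __name__ == "__main__":
    total = total_volume_ul()
    print("total volume (uL):", total)
    print("buffer (X):", final_concentration(10.0, REACTION_UL["Pfu 10X Buffer"], total))
    print("dNTP mix (mM):", final_concentration(10.0, REACTION_UL["10 mM dNTP Mix"], total))
    # 10 pmol/uL is 10 micromolar, so the result is in micromolar.
    print("each primer (uM):", final_concentration(10.0, REACTION_UL["Forward primer"], total))
```

On this reading the components sum to a 50 µL reaction, so the 10X buffer is diluted to 1X.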



The PCR tubes are placed into a thermocycler (MJ Instruments) and the
temperature cycling program is initiated. The cycling program parameters are shown in
the table below:

STEP   TEMP          TIME
1      94C           2 minutes
2      (illegible)   1 minute
3      55C           1 minute
4      72C           1 minute
5      Go to Step 2, 30X
6      72C           10 minutes
7      4C            Forever


The PCR products are purified using the QIAquick PCR Purification Kit (Qiagen Inc.,
Valencia, CA) according to the manufacturer's instructions and resuspended in 10
microliters of TE.

EXAMPLE 9

Functional Test of 205 Base Pair Enzymatically Amplified Fragments of the 
lacI Gene Carrying Modified Bases 

The enzymatic fragments produced in Example 8 are cloned into the lacI 
gene to test their biological function. Ten micrograms of plasmid vector pWB1000 
(Lehming et al., PNAS 85:7947-7951, 1988) and each of the PCR reactions from Example 
8 are digested with restriction enzymes EcoRI and HindIII. Each of the cut amplification 
products and the vector fragment are gel purified using a StrataPrep DNA extraction kit 
(Stratagene product #400766) according to the manufacturer's instructions, and resuspended 
in 100 microliters of TE. Each of the cut PCR reactions and one microgram of each lacI 
fragment is treated with T4 polynucleotide kinase, extracted once with phenol and once 
with chloroform, ethanol precipitated and resuspended in 20 microliters of TE. Five 
microliters of the cut vector and the entire sample of the amplified DNA are ligated in a 
total volume of 100 microliters using New England Biolabs T4 DNA ligase according to 
the manufacturer's instructions. The ligation mix is extracted once with Strataclean, 
concentrated and washed twice with 1/10th concentration TE and brought to a volume of 10 
microliters in 1/10th concentration TE. One microliter of this mix is transferred into E. coli 
strain DC 41-2 carrying plasmid pWB310 (Lehming et al., EMBO 6:3145-3153, 1987) by 
electroporation using a BTX ECM399 electroporator according to the manufacturer's 
instructions. Colonies are grown overnight on LB plates in the presence of 10 mg/liter 
tetracycline, 200 mg/liter ampicillin, 60 mg/liter X-gal and 300 mg/liter IPTG. Colonies 
carrying a plasmid with a functional lacI gene are white; those without a functional lacI 
gene are blue. Each modified fragment is characterized by the frequency of blue colonies 
relative to the frequency of blue colonies derived from clones of the native synthetic lacI 
fragment as described in Example 2. 
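The blue/white readout described above reduces to a ratio of two frequencies. The sketch below shows one way to compute it; the function names and the colony counts are hypothetical and serve only to illustrate the arithmetic, not to report results.

```python
def blue_frequency(blue, white):
    """Fraction of colonies that are blue (no functional lacI gene)."""
    total = blue + white
    if total == 0:
        raise ValueError("no colonies counted")
    return blue / total


def relative_blue_frequency(sample_counts, native_counts):
    """Blue frequency of a sample divided by that of the native control.

    Each argument is a (blue, white) tuple of colony counts.
    """
    native_freq = blue_frequency(*native_counts)
    if native_freq == 0:
        raise ValueError("native control has no blue colonies")
    return blue_frequency(*sample_counts) / native_freq


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the calculation.
    modified = (50, 150)  # 50 blue, 150 white
    native = (25, 175)    # 25 blue, 175 white
    print(relative_blue_frequency(modified, native))
```

A value above 1 indicates more non-functional clones than the native control; a value near 1 indicates a comparable error load.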

EXAMPLE 10 

Enrichment of Native lacI Fragments From Mixtures of Native and 
Enzymatically Amplified Fragments of the lacI Gene Carrying Modified 
Bases by Preparative HPLC 

The ability of the chromatographic technique to enrich a population of 
enzymatically amplified base paired DNA composed of "correct" DNA in the presence of 
"incorrect" DNA is shown by spiking native lacI DNA with each of the eight amplified lacI 
DNA fragments from Example 8, denaturing and reannealing the mix, and enriching for the correct 
DNA using HPLC. For each of the eight amplified DNA fragments from Example 8 an 
equimolar mixture is prepared of amplified native and amplified modified fragments by mixing 20 
pmoles of the amplified DNA with 20 pmoles of the amplified native fragment. A fraction 
of each mixture is retained for functional testing as described below. The remainder of 
each of these samples is chromatographed using thermal and gradient conditions (identified 
in Example 7) which alter the mobility of the modified fragments relative to the native 
fragment. For each sample, the peaks are collected with a fraction collector as described in 
Example 1 at the elution time determined in Example 7. Two fractions are collected, one 
with a mobility characteristic of the modified DNA fragments and one with a slower 
mobility characteristic of the native DNA fragment. These fractions are dried down and 
cloned as described in Example 9. In parallel, a portion of each of the eight unfractionated 
mixtures is cloned and tested in the same way. The "native fraction" fragments show a 
lower number of sequence errors than the original mixtures or the early-eluting fractions, as 
indicated by the frequency of blue colonies. 
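The 20 pmole amounts used for the equimolar mixtures can be converted to mass when the fragments are quantified by weight. The following sketch assumes an average mass of 650 g/mol per base pair, a common approximation that is not stated in this document, and uses the nominal 205 base pair fragment length; the tailed PCR product is somewhat longer, so the length is left as a parameter.

```python
# Assumption: average molar mass of one base pair of double-stranded DNA.
AVG_BP_MASS_G_PER_MOL = 650.0


def dsdna_mass_ng(pmol, length_bp, bp_mass=AVG_BP_MASS_G_PER_MOL):
    """Approximate mass in nanograms of a double-stranded DNA fragment.

    pmol * (g/mol) gives picograms; dividing by 1000 gives nanograms.
    """
    if pmol < 0 or length_bp <= 0:
        raise ValueError("pmol must be non-negative and length_bp positive")
    return pmol * length_bp * bp_mass / 1000.0


if __name__ == "__main__":
    # 20 pmol of a nominal 205 bp fragment, as used for each half of a mixture.
    print("Approximate mass (ng):", dsdna_mass_ng(20, 205))
```

Under this approximation, 20 pmoles of a 205 base pair duplex corresponds to roughly 2.7 micrograms.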

EXAMPLE 11 

Enrichment of Native lacI Fragments From Mixtures of Native and 
Enzymatically Amplified Fragments of the lacI Gene Carrying Modified 
Bases by Removal of Mismatches With a Mismatch Binding Protein 
Immobilized to Magnetic Beads 

The ability of the mismatch binding protein to enrich a population of 
enzymatically amplified base paired DNA composed of "correct" DNA in the presence of 
"incorrect" DNA is shown by spiking native lacI DNA with each of the eight amplified lacI 
DNA fragments from Example 8, denaturing and reannealing the mix, and enriching for the correct 
DNA using mismatch binding protein immobilized to magnetic beads. For each of the 
eight amplified DNA fragments from Example 8 an equimolar mixture is prepared of 
amplified native and amplified modified fragments by mixing 20 pmoles of the amplified DNA with 
20 pmoles of the amplified native fragment. A fraction of each mixture is retained for 
functional testing as described below. 

The remainder of each of these samples is exposed to MutS immobilized 
on magnetic beads (GeneCheck Inc., Fort Collins, CO). The magnetic particles are 
collected on the walls of the tube with a magnet and the supernatant is collected. These 
supernatants are dried down and cloned as described in Example 9. In parallel, a portion of 
each of the eight unpurified mixtures is cloned and tested in the same way. The "native 
fraction" fragments show a lower number of sequence errors than the original mixtures or 
the early-eluting fractions, as indicated by the frequency of blue colonies. 

EXAMPLE 12 

Enrichment of Native lacI Fragments From Mixtures of Native and 
Enzymatically Amplified Fragments of the lacI Gene Carrying Modified 
Bases by Removal of Mismatches With a Mismatch Binding Protein Passed 
Through a Nitrocellulose Filter 

The ability of the mismatch binding protein to enrich a population of 
enzymatically amplified base paired DNA composed of "correct" DNA in the presence of 
"incorrect" DNA is shown by spiking native lacI DNA with each of the eight amplified lacI 
DNA fragments from Example 8, denaturing and reannealing the mix, and enriching for the correct 
DNA using mismatch binding protein passed through a nitrocellulose filter. For each of 
the eight amplified DNA fragments from Example 8 an equimolar mixture is prepared of 
amplified native and amplified modified fragments by mixing 20 pmoles of the amplified DNA with 
20 pmoles of the amplified native fragment. A fraction of each mixture is retained for 
functional testing as described below. 

The remainder of each of these samples is exposed to MutS (Amersham 
Pharmacia Biotech, Uppsala, Sweden) immobilized onto a nitrocellulose filter. A 
nitrocellulose sheet (0.45 micron, Schleicher and Schuell, BA85) was pre-wet by floating in 
reaction buffer (20 mM Tris-HCl, pH 7.6; 5 mM MgCl2; 0.1 mM DTT; 0.01 mM EDTA). 
MutS (500 ng/well) in reaction buffer was bound to the nitrocellulose in a 48 well slot 
blotting apparatus (Hoefer Scientific Instruments) over 3 sheets of blotting paper 
(Schleicher and Schuell GB002). After 20 min. at room temperature, the nitrocellulose was 
blocked with 200 µl/well of 3% horseradish peroxidase (HRP)-free bovine serum albumin 
(BSA). After 1 hour, excess blocking solution was pulled through with vacuum and DNA 
(1 ng and 10 ng) was added in 20 µl reaction buffer containing 3% BSA. After 30 min. at 
room temperature, wells were washed 1 time with 100 µl reaction buffer. The wash fluids 
were decanted rather than pulled through with vacuum. These supernatants are dried down 
and cloned as described in Example 9. In parallel, a portion of each of the eight unpurified 
mixtures is cloned and tested in the same way. The "native fraction" fragments show a 
lower number of sequence errors than the original mixtures or the early-eluting fractions, as 
indicated by the frequency of blue colonies. 

From the foregoing it will be appreciated that, although specific 
embodiments of the invention have been described herein for purposes of illustration, 
various modifications may be made without deviating from the spirit and scope of the 
invention. 

CLAIMS 

1. A method of depleting in a sample of double-stranded 
oligonucleotides a population of double-stranded oligonucleotides containing mismatched 
bases thereby enriching in said sample a population of double-stranded oligonucleotides 
containing correctly matched bases, comprising the steps of: 

(a) contacting said sample containing double-stranded oligonucleotides 
with a mismatch recognition protein under conditions to permit the protein to interact with 
a double-stranded oligonucleotide containing at least one mismatched base; and 

(b) collecting double-stranded oligonucleotides that have not interacted 
with said mismatch recognition protein, thereby depleting the population of double- 
stranded oligonucleotides containing mismatched bases. 

2. The method of claim 1, wherein prior to the step of collecting, 
having an additional step comprising separating said double-stranded oligonucleotide 
containing at least one mismatched base that has interacted with said mismatch recognition 
protein, from double-stranded oligonucleotides that have not interacted with said mismatch 
recognition protein. 

3. The method of claim 1 wherein the double-stranded oligonucleotides 
of said sample are chemically synthesized. 

4. The method of claim 1 wherein the double-stranded oligonucleotides 
of said sample are enzymatically synthesized. 

5. The method of claim 4, wherein prior to the step of contacting, 
having additional steps comprising denaturing and reannealing said sample of double- 
stranded oligonucleotides under conditions to permit conversion of the double-stranded 
oligonucleotides first to single-stranded oligonucleotides and then to double-stranded 
oligonucleotides. 

6. The method of claim 1 wherein said mismatch recognition protein is 
immobilized on a solid support. 

7. The method of any one of claims 1, 2, 3, 4, 5 or 6 wherein said 
double-stranded oligonucleotides are double-stranded DNA. 



WO 03/054232 PCT/US02/40138 

8. The method of claim 7 wherein the DNA is a gene or a portion of a 
gene. 